updated on 2025-02-05
To see NIMBLE in practice, we will run an analysis of the simplest of all models - the “model of the mean”.
If I gave you a dataset and asked you for the mean and variation, what would you do?
Probably something like mean(data) and maybe
var(data) or sd(data). Let’s go ahead and do
that:
(data.frame(summary = c('mean', 'SD', 'SE'),
            value = c(round(mean(tree_diameter), 1),
                      round(sd(tree_diameter), 1),
                      round(sd(tree_diameter)/sqrt(length(tree_diameter)), 1))))
##   summary value
## 1    mean  56.3
## 2      SD  13.6
## 3      SE   4.3
What if I asked you to use a model to estimate the mean and the standard error of a dataset - how could you do this?
Although you might not think of it this way, if we assume normally distributed random error, we can fit an intercept-only linear model to our data and use the intercept estimate to describe the mean and standard error of our dataset. Remember, the intercept of a linear model is the mean (or expected) value of our response when all predictors equal 0; with no predictors, the intercept is simply the mean of the data.
summary(lm(tree_diameter ~ 1))
##
## Call:
## lm(formula = tree_diameter ~ 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.30 -10.05 -1.80 5.45 28.70
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 56.3 4.3 13.09 3.65e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.6 on 9 degrees of freedom
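The equivalence between the intercept-only model and the simple summaries is easy to verify on simulated data (the sketch below simulates its own tree_diameter, since the data-loading chunk isn't shown here):

```r
# intercept-only lm recovers the sample mean and its standard error
set.seed(1)
tree_diameter <- rnorm(10, mean = 56, sd = 14)  # simulated stand-in data

fit <- lm(tree_diameter ~ 1)       # no predictors: intercept = mean
est <- coef(summary(fit))          # Estimate, Std. Error, t value, Pr(>|t|)

est["(Intercept)", "Estimate"]     # identical to mean(tree_diameter)
est["(Intercept)", "Std. Error"]   # identical to sd(tree_diameter)/sqrt(10)
```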
While finding these values is routine and simple, we'll overcomplicate it to illustrate a few key parts of a Bayesian analysis. We'll now replicate this analysis within nimble.
Steps to our analysis:
1. Load the nimble package.
2. Write the model using nimbleCode.
3. Prepare data, constants, initial values, and MCMC settings, then run the MCMC in NIMBLE (see frog slides).
What does a model look like for this?
## nimble version 1.0.1 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
##
## Note for advanced users who have written their own MCMC samplers:
## As of version 0.13.0, NIMBLE's protocol for handling posterior
## predictive nodes has changed in a way that could affect user-defined
## samplers in some situations. Please see Section 15.5.1 of the User Manual.
##
## Attaching package: 'nimble'
## The following object is masked from 'package:stats':
##
## simulate
tree_model01 <- nimbleCode({
# Note: historically had to specify "precision" in BUGS/JAGS
# precision = 1/pop variance
# pop variance = pop sd*pop sd
# In Nimble, we can just specify SD, but need to be explicit in likelihood distribution
# Priors
pop.mean ~ dnorm(53, sd = 5) # need the "sd =" or have to provide precision (1/(5*5) = 0.04)
# need to specify a prior for the standard deviation of this sample.
# we could approach it several ways, but first we'll use the SD of the actual observations (we'll check out other options in subsequent models)
pop.sd <- tree_sd # we'll pass `tree_sd` in as data
# likelihood
for(i in 1:nObs){
tree[i] ~ dnorm(pop.mean, sd = pop.sd)
}
})
NIMBLE needs initial values for the parameters it samples. Here we only need an initial value for pop.mean, because pop.sd is fixed at a value taken from the actual data. We specified pop.mean as a normal distribution with mean 53 and sd 5 - just to show that the sampler can recover, let's hand NIMBLE an initial value of -200.
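The objects handed to nimbleMCMC() below (tree_data, tree_constants, inits, keepers, and the MCMC settings) come from a chunk that isn't shown. A sketch of plausible setup code - object names are inferred from the call, the data vector is constructed to match the summary statistics shown earlier, and the MCMC settings mirror those shown later for tree_model03:

```r
# hypothetical setup for tree_model01 (names inferred, values illustrative)
tree_diameter <- c(42, 43, 58, 70, 47, 51, 85, 63, 58, 46)  # mean 56.3, sd ~13.6

tree_data      <- list(tree = tree_diameter,
                       tree_sd = sd(tree_diameter))  # pop.sd fixed at the sample SD
tree_constants <- list(nObs = length(tree_diameter))

inits   <- list(pop.mean = -200)     # the deliberately bad starting value
keepers <- c('pop.mean', 'pop.sd')   # parameters to monitor

# MCMC settings
nc <- 3           # chains
nb <- 1000        # burn-in iterations
ni <- nb + 2000   # total iterations per chain
nt <- 1           # thinning interval
```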
samples <- nimbleMCMC(
code = tree_model01,
constants = tree_constants,
data = tree_data,
inits = inits,
monitors = keepers,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
  summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
# function to summarize samples
getValues <- function(x){
mean <- mean(x)
sd <- sd(x)
quants <- quantile(x, probs = c(0.025, 0.5, 0.975))
out <- matrix(data = c(mean, sd, quants), nrow = 1, dimnames = list(1,c('mean',
'sd',
'lower95',
'median',
'upper95')))
return(out)
}
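A quick sanity check of getValues on a vector with known quantiles (the function is repeated here so the snippet stands alone):

```r
# getValues: summarize draws as mean, sd, and 2.5/50/97.5% quantiles
getValues <- function(x){
  mean <- mean(x)
  sd <- sd(x)
  quants <- quantile(x, probs = c(0.025, 0.5, 0.975))
  out <- matrix(data = c(mean, sd, quants), nrow = 1,
                dimnames = list(1, c('mean', 'sd', 'lower95', 'median', 'upper95')))
  return(out)
}

res <- getValues(1:100)   # toy "posterior": the integers 1..100
res[, 'mean']             # 50.5
res[, 'median']           # 50.5
```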
# .......................................................................
# .......................................................................
# First, "Summary" gives us some simple stuff
samples$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 54.82732 54.77144 3.286314  48.41007  61.43406
## pop.sd   13.59779 13.59779 0.000000  13.59779  13.59779
# this is important - let's talk through it.
# prior knowledge said roughly 43-63 (dnorm with mean 53, sd 5)
# we updated that prior knowledge with new data and narrowed the 95% CI.
# pop.sd is exactly the value we passed in - it never gets sampled, it's just a constant
# (HOW DOES A 95% CREDIBLE INTERVAL differ from a 95% CONFIDENCE INTERVAL?)
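To make the credible-vs-confidence question concrete: a 95% credible interval is read directly off the posterior draws as quantiles, while the familiar confidence-style interval is built from a point estimate and its standard error. A sketch with simulated draws (stand-in numbers, not the model output above):

```r
# 95% credible interval: quantiles of the posterior draws themselves
set.seed(42)
draws <- rnorm(10000, mean = 54.8, sd = 3.3)     # stand-in posterior samples
cred_int <- quantile(draws, probs = c(0.025, 0.975))
cred_int  # "given the data, the parameter lies here with probability 0.95"

# confidence-style interval: estimate +/- 1.96 * SE
conf_int <- 54.8 + c(-1.96, 1.96) * 3.3
conf_int  # a statement about the procedure under repeated sampling
```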
# .......................................................................
# INSPECT RESULTS
# .......................................................................
# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples$samples, coda::mcmc))
# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])
# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within a chain to the variation of samples when the chains are pooled; within- and between-chain variation should stabilize with convergence (i.e., the ratio goes to 1)
# rule of thumb is anything <1.1
coda::gelman.diag(samples_mcmc[,1]) # just look at pop.mean = all good
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## [1,] 1 1
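The diagnostic above can be computed by hand. A sketch of the classic (non-split) Gelman-Rubin statistic, which compares within-chain variance W to between-chain variance B:

```r
# hand-rolled (non-split) Gelman-Rubin Rhat for a list of equal-length chains
rhat <- function(chains){
  n <- length(chains[[1]])              # iterations per chain
  W <- mean(sapply(chains, var))        # mean within-chain variance
  B <- n * var(sapply(chains, mean))    # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)   # potential scale reduction factor
}

set.seed(7)
chains <- list(rnorm(2000), rnorm(2000))  # two well-mixed "chains"
rhat(chains)                              # close to 1, as expected at convergence
```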
# extract mean and SD
samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd  lower95   median  upper95
## 1 54.82732 3.286314 48.41007 54.77144 61.43406
getValues(pop.sd)
##       mean sd  lower95   median  upper95
## 1 13.59779  0 13.59779 13.59779 13.59779
# MCMCplots is nice too
library(mcmcplots)
## Loading required package: coda
mcmcplot(samples$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model01")
##
## Preparing plots for pop.mean. 50% complete.
##
## Preparing plots for pop.sd. 100% complete.
MCMCplots Results here
How much of an influence did our priors have on our estimate of the population mean?
What if we just plead complete ignorance? Like, non-ecologist knowledge of tree diameters.
tree_model02 <- nimbleCode({
## Priors ##
pop.mean ~ dunif(0,200) # the population mean has an equal probability of being any number between 0 and 200. 200 is arbitrary - what if we put 500? 1000?
pop.sd ~ dunif(0, 100) # the pop.sd has an equal probability of being any number between 0 and 100
# NOTE: before nimble, had to do something like this:
# pop.sd ~ dunif(0, 100)
# pop.var <- pop.sd * pop.sd
# pop.prec <- 1/pop.var # pass pop.prec to dnorm below
# likelihood
for(i in 1:nObs){
tree[i] ~ dnorm(pop.mean, sd = pop.sd)
}
})
Since we changed prior distributions, we need to modify the starting values that we pass to nimble. We also need to pass a starting value for pop.sd, since we are no longer passing it in as data.
We are pretty safe with almost any starting value because of the super flat priors we specified - it would be hard to mess up.
Below I will just pull random numbers from distributions identical to the priors.
Note that we are no longer passing tree_sd in as data - pop.sd is now drawn from its prior.
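The chunk that builds these objects isn't shown; a sketch consistent with the priors in tree_model02 (and matching the initial-value code shown later for tree_model03):

```r
# inits drawn from distributions identical to the priors in tree_model02
inits <- list(pop.mean = runif(n = 1, min = 0, max = 200),   # matches dunif(0, 200)
              pop.sd   = runif(n = 1, min = 0, max = 100))   # matches dunif(0, 100)

# data and constants: note no tree_sd this time
tree_diameter  <- c(42, 43, 58, 70, 47, 51, 85, 63, 58, 46)  # illustrative data
tree_data      <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))
```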
# one call
samples02 <- nimbleMCMC(
code = tree_model02, # changed model name
constants = tree_constants,
data = tree_data,
inits = inits,
monitors = keepers,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
  summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
samples02$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.26674 56.23745 5.308168 45.83599 67.05120
## pop.sd 15.76228 14.76271 4.458699 9.64558 26.98514
# .......................................................................
# INSPECT RESULTS
# .......................................................................
# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples02$samples, coda::mcmc))
# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])
# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within a chain to the variation of samples when the chains are pooled; within- and between-chain variation should stabilize with convergence (i.e., the ratio goes to 1)
# rule of thumb is anything <1.1
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## pop.mean 1.01 1.03
## pop.sd 1.01 1.03
##
## Multivariate psrf
##
## 1.01
# extract posterior draws of pop.mean and pop.sd
samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd  lower95   median upper95
## 1 56.26674 5.308168 45.83599 56.23745 67.0512
getValues(pop.sd)
##       mean       sd lower95   median  upper95
## 1 15.76228 4.458699 9.64558 14.76271 26.98514
# MCMCplots is nice too
library(mcmcplots)
mcmcplot(samples02$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model02")
##
## Preparing plots for pop.mean. 50% complete.
##
## Preparing plots for pop.sd. 100% complete.
MCMCplots Results here
This is a good look at typical output from a Bayesian model. A bit messy; each chain is slightly different.
- You should note increased uncertainty in our distributions.
- Good news is that even pleading ignorance, our 95% CI of the mean overlaps truth.
- Just to show that the “flatness” of the prior shouldn't change things too much - flat is flat.
tree_model03 <- nimbleCode({
## Priors ##
pop.mean ~ dunif(0,1000) # the population mean has an equal probability of being any number between 0 and 1000
pop.sd ~ dunif(0, 500) # the pop.sd has an equal probability of being any number between 0 and 500
# NOTE: before nimble, had to do something like this:
# pop.sd ~ dunif(0, 100)
# pop.var <- pop.sd * pop.sd
# pop.prec <- 1/pop.var # pass pop.prec to dnorm below
# likelihood
for(i in 1:nObs){
tree[i] ~ dnorm(pop.mean, sd = pop.sd)
}
})
# don't really need to change initial values - they will fall within acceptable values and will still be random
inits <- list(pop.mean = runif(n = 1, min = 0, max = 200),
pop.sd = runif(n = 1, min = 0, max = 100))
# data and constants
tree_data <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))
# params to monitor
keepers <- c('pop.mean', 'pop.sd')
# MCMC settings
nc = 3
nb = 1000
ni = nb + 2000
nt = 1
# one call
samples03 <- nimbleMCMC(
code = tree_model03, # changed model name
constants = tree_constants,
data = tree_data,
inits = inits,
monitors = keepers,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
  summary = T) # get jags-style summary of posterior
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
samples03$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.34326 56.38719 4.993508 45.934104 66.23678
## pop.sd 16.03778 15.22397 4.639292 9.649754 27.53042
# let's talk through this one too.
# with these very flat priors, the data dominate: the posterior mean tracks the sample mean
# unlike model01, pop.sd is now sampled, so it has its own posterior distribution
# widening the uniforms (0-1000, 0-500) barely changed the estimates - flat is flat
# .......................................................................
# INSPECT RESULTS
# .......................................................................
# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples03$samples, coda::mcmc))
# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])
# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within a chain to the variation of samples when the chains are pooled; within- and between-chain variation should stabilize with convergence (i.e., the ratio goes to 1)
# rule of thumb is anything <1.1
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## pop.mean 1.00 1.01
## pop.sd 1.01 1.02
##
## Multivariate psrf
##
## 1
# extract posterior draws of pop.mean and pop.sd
samplesdf <- do.call(rbind, samples_mcmc)
pop.mean <- samplesdf[, 1]
pop.sd <- samplesdf[, 2]
getValues(pop.mean)
##       mean       sd lower95   median  upper95
## 1 56.34326 4.993508 45.9341 56.38719 66.23678
getValues(pop.sd)
##       mean       sd  lower95   median  upper95
## 1 16.03778 4.639292 9.649754 15.22397 27.53042
# MCMCplots is nice too
library(mcmcplots)
mcmcplot(samples03$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model03")
##
## Preparing plots for pop.mean. 50% complete.
##
## Preparing plots for pop.sd. 100% complete.
MCMCplots Results here
# model
tree_model04 <- nimbleCode({
## Priors ##
pop.mean ~ dnorm(0, sd = 100) # still flat and 'uninformative', but now a vague normal rather than a uniform
# to set up the standard deviation, we need to make sure that the values are not negative.
# If you remember, a gamma distribution has support from 0 -> Inf. This is a good choice.
# we'll set the gamma on the precision term and then transform that to something more natural to us - the std. dev.
prec ~ dgamma(0.1, 0.1)
pop.sd <- 1/sqrt(prec)
# likelihood
for(i in 1:nObs){
tree[i] ~ dnorm(pop.mean, sd = pop.sd)
}
})
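The precision-to-SD transformation in the priors above is just algebra: precision = 1/variance = 1/sd^2. A quick base-R check mirroring the gamma prior in the model:

```r
# draw precisions from the prior, transform to SDs as in tree_model04
set.seed(99)
prec <- rgamma(n = 1000, shape = 0.1, rate = 0.1)  # gamma prior on precision
pop.sd <- 1 / sqrt(prec)                           # sd = 1/sqrt(precision)

# round-trip: converting the SDs back recovers the precisions
all.equal(1 / pop.sd^2, prec)  # TRUE
```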
# inits
inits <- list(pop.mean = rnorm(n = 1, 100, 10), # now rnorm
prec = rgamma(n = 1, 0.1,0.1))
# gather data and constants
tree_data <- list(tree = tree_diameter)
tree_constants <- list(nObs = length(tree_diameter))
# parameters to monitor
keepers <- c('pop.mean', 'pop.sd')
# MCMC settings
nc = 3
nb = 5000
ni = nb + 5000
nt = 1
# get samples
samples04 <- nimbleMCMC(
code = tree_model04, # changed model name
constants = tree_constants,
data = tree_data,
inits = inits,
monitors = keepers,
niter = ni,
nburnin = nb,
thin = nt,
nchains = nc,
  summary = T)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
## [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
## Checking model calculations
## Compiling
## [Note] This may take a minute.
## [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 2...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
## running chain 3...
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
samples04$summary$all.chains
##              Mean   Median  St.Dev. 95%CI_low 95%CI_upp
## pop.mean 56.20605 56.20693 4.713230 46.725845 65.62689
## pop.sd 14.58666 13.94124 3.764879 9.207031 24.14396
# convert to mcmc object for inspection via coda package
samples_mcmc <- coda::as.mcmc.list(lapply(samples04$samples, coda::mcmc))
# Look at traceplots of the parameters
par(mfrow=c(1,2))
coda::traceplot(samples_mcmc[, 1:2])
# calculate Rhat convergence diagnostic of parameters
# "Gelman-Rubin statistic" - compares the variation of samples within a chain to the variation of samples when the chains are pooled; within- and between-chain variation should stabilize with convergence (i.e., the ratio goes to 1)
# rule of thumb is anything <1.1
coda::gelman.diag(samples_mcmc) # we can now look at both mean and sd
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## pop.mean 1 1
## pop.sd 1 1
##
## Multivariate psrf
##
## 1
library(mcmcplots)
mcmcplot(samples04$samples, dir = here::here('Modules/02_Intro_Bayes/files_for_slides'), filename = "tree_model04")
##
## Preparing plots for pop.mean. 50% complete.
##
## Preparing plots for pop.sd. 100% complete.
MCMCplots Results here
White (1968) provides information on the weights of 12 adult male peregrines.